CS 294-1 A1: Naive Bayesian Classifier

Author

  • Liwen Sun
Abstract

Settings. Our code was written in Scala and built with Simple Build Tool (SBT). The programs were run on Mac OS. We test the effectiveness of our implementation in several respects. Unless stated otherwise, we adopt the following default settings. We report macro-averaged F1 scores, averaged over ten-fold cross-validation. We consider both the “Bernoulli” and “Multinomial” models. As features, we use all words that remain after stemming and stop-word elimination. Later discussion explains why certain defaults were chosen.
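
As a rough illustration of the evaluation protocol described above, the following sketch computes a macro-averaged F1 score and averages it over ten folds. It is written in Python for brevity and is not the author's Scala code. The names `macro_f1`, `kfold_indices`, `cross_validated_macro_f1`, `train_fn`, and `predict_fn` are hypothetical, and the fold assignment (every k-th document, no shuffling) is an assumption, since the abstract does not say how the folds were drawn.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    scores = []
    for c in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


def kfold_indices(n, k=10):
    """Yield (train_idx, test_idx) pairs; fold i holds out items i, i+k, i+2k, ..."""
    for i in range(k):
        test = list(range(i, n, k))
        held_out = set(test)
        train = [j for j in range(n) if j not in held_out]
        yield train, test


def cross_validated_macro_f1(docs, labels, train_fn, predict_fn, k=10):
    """Average macro-F1 over k folds (assumes len(docs) >= k).

    train_fn(docs, labels) -> model and predict_fn(model, doc) -> label
    are hypothetical callbacks standing in for any classifier.
    """
    fold_scores = []
    for train, test in kfold_indices(len(docs), k):
        model = train_fn([docs[j] for j in train], [labels[j] for j in train])
        preds = [predict_fn(model, docs[j]) for j in test]
        fold_scores.append(macro_f1([labels[j] for j in test], preds))
    return sum(fold_scores) / len(fold_scores)
```

Macro-averaging gives every class the same weight regardless of how many documents it contains, so a classifier cannot score well by favoring the larger class alone.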

Similar resources

CS 294-1: Assignment 1 Naive Bayes Classification with Improvements

The main objective of this assignment was to implement a Naive Bayes classifier and attempt certain improvements upon the vanilla version. A major challenge was to implement the classifier in Scala using the two libraries scalala and scalanlp. This report presents details regarding the different experiments I tried out, namely varying the smoothing parameter, feature selection, n-gram models an...

CS 294-1 Assignment 1 Report

Text classification has a growing range of potential applications in the information world, such as recommender systems and customer service. The goal of this assignment is to apply a Naive Bayes classifier to a data set of labeled textual movie reviews and to practice Scala/ScalaNLP. The data set “Polarity dataset v2.0” is from http://www.cs.cornell.edu/People/pabo/movie-reviewdata/, created by...

A New Hybrid Framework for Filter based Feature Selection using Information Gain and Symmetric Uncertainty (TECHNICAL NOTE)

Feature selection is a pre-processing technique used for eliminating the irrelevant and redundant features which results in enhancing the performance of the classifiers. When a dataset contains more irrelevant and redundant features, it fails to increase the accuracy and also reduces the performance of the classifiers. To avoid them, this paper presents a new hybrid feature selection method usi...

CS 6604: Data Mining

In the last lecture we discussed the relationships between different modeling paradigms such as the Bayesian approach, the Maximum A Posteriori (MAP) approach, the Maximum Likelihood (ML) approach, and the Least-squares (LS) method. In this lecture we first prove the equivalence of LS and ML under the assumption of normally distributed error. Then, the notions of the naive Bayesian classifier and the L...

The Indifferent Naive Bayes Classifier

The Naive Bayes classifier is a simple and accurate classifier. This paper shows that assuming the Naive Bayes classifier model and applying Bayesian model averaging and the principle of indifference, an equally simple, more accurate and theoretically well founded classifier can be obtained. Introduction In this paper we use Bayesian model averaging and the principle of indifference to derive a...

Publication date: 2012